Abstract



We consider a robust model proposed by Scarf, 1958, 
for stochastic optimization when only the marginal 
probabilities of (binary) random variables are given, 
and the correlation between the random variables is 
unknown. In the robust model, the objective is to 
minimize expected cost against worst possible joint 
distribution with those marginals. We introduce the 
concept of correlation gap to compare this model to 
the stochastic optimization model that ignores corre- 
lations and minimizes expected cost under indepen- 
dent Bernoulli distribution. We identify a class of 
functions, using concepts of summable cost sharing 
schemes from game theory, for which the correlation 
gap is well-bounded and the robust model can be ap- 
proximated closely by the independent distribution 
model. As a result, we derive efficient approximation 
factors for many popular cost functions, like submodular
functions, facility location, and Steiner tree. As
a byproduct, our analysis also yields some new re- 
sults in the areas of social welfare maximization and 
existence of Walrasian equilibria, which may be of 
independent interest.

1 Introduction

Stochastic optimization models decision making un- 
der uncertain or unknown problem data. We con- 
sider stochastic optimization problems in which the 
uncertain variable is the "demand" set. For example, 
in stochastic network design problems, the random 
variable is the subset of source-destination pairs to 
be connected; in stochastic facility location problem, 
the random variable is the subset of potential clients 
that will have a demand; and in stochastic set cover 
problem, it is the subset of elements that need to be 
covered. In general, such a stochastic program can 
be expressed as 



(1)    $\min_{x \in C} \; \mathbb{E}[f(x, S)],$

where x is the decision variable, which lies in a
constrained set C; the random subset $S \subseteq V$
cannot be observed before the decision x is made; and
f(x, S) is the cost function, which depends on both
the decision x and the outcome scenario S. The
objective of stochastic programming is to minimize
the expected cost, which depends on the joint
distribution of items in V.

In stochastic optimization, it is typically as- 
sumed that the distribution of random variable is 
either known or can be sampled from [1] [3l [H] . In 
this model, sample average approximation (SAA) 
has been used to give approximation algorithms for
many two-stage stochastic discrete optimization 
problems, including stochastic set cover [Tl], un- 
capacitated facility location [13], and Steiner tree 
problem [5]. Those models are suitable when one 
does have access to a lot of time invariant reliable 
statistical information. In this paper, we study
the problem when only partial information about the
distribution (the marginals) is known. When only the
marginal probability $p_i$ of each element is
available, a common heuristic is to assume that the
distribution of the random set S is a product distribution.
In other words, each element i appears in S independently
with the given probability $p_i$. For example,
see [8][9]. However, there is a conventional wisdom
that ignoring correlations can have catastrophic 
consequences. Examples can be constructed such 
that the cost of the solution optimized against the 
independent distribution performs very poorly once 
certain correlations are introduced. 

To address such problems, Scarf (1958, [13])
proposed a correlation-robust, or distributionally
robust, stochastic model, which minimizes the
expected cost under the worst-case distribution among
those having a fixed marginal probability $p_i$ for
each $i \in V$, with arbitrary correlations. For a
problem instance $(f, V, \{p_i\})$, we wish to find

(2)    $\min_{x \in C} \; g(x),$

where g(x) is the expected cost under the worst-case
distribution when decision x has been made, given by

(3)    $g(x) = \max_{\mathcal{D}} \; \mathbb{E}_{\mathcal{D}}[f(x, S)]$
       s.t. $\sum_{S : i \in S} \Pr_{\mathcal{D}}(S) = p_i, \quad \forall i \in V.$

We believe this is a very useful model because it 
takes advantage of the stochasticity of the input, 
and at the same time efficiently utilizes the available 
information. On the other hand, it defines an 
exponential size linear program which makes the 
problem potentially difficult to solve. A common 
strategy for such linear programs is to solve the 
corresponding dual LP with exponential number of 
constraints, using separating hyperplane approach. 
However, for the above model, approximating the 
separating hyperplane problem can be shown to 
be harder than the max-cut problem even for the 
special case when the function f is submodular in S.
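For reference, the worst-case problem (3) and the dual LP mentioned above can be written out explicitly. The following is a sketch based on standard LP duality; the symbols $\alpha_S$ (probability assigned to scenario S), $\lambda_i$, and $\mu$ (dual variables) are notation introduced here for illustration, and the normalization constraint is the one implicit in requiring a probability distribution.

```latex
% Primal: worst-case distribution for a fixed decision x (one variable per subset S)
\begin{align*}
g(x) = \max_{\alpha \ge 0} \quad & \sum_{S \subseteq V} \alpha_S \, f(x, S) \\
\text{s.t.} \quad & \sum_{S : i \in S} \alpha_S = p_i \quad \forall i \in V,
\qquad \sum_{S \subseteq V} \alpha_S = 1 .
\end{align*}
% Dual: |V| + 1 free variables, one constraint per subset S
\begin{align*}
g(x) = \min_{\lambda \in \mathbb{R}^{V},\, \mu \in \mathbb{R}} \quad & \mu + \sum_{i \in V} p_i \lambda_i \\
\text{s.t.} \quad & \mu + \sum_{i \in S} \lambda_i \ge f(x, S) \quad \forall S \subseteq V .
\end{align*}
```

In this form, separating over the dual constraints amounts to maximizing $f(x, S) - \sum_{i \in S} \lambda_i$ over subsets S, which is the step the text identifies as hard when f is submodular in S.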

A natural question is how much risk is involved in
simply ignoring the correlations and minimizing the
expected cost under the independent distribution instead
of the worst-case distribution. In other words,
how well does the stochastic optimization model with
independent distribution approximate the
correlation-robust model? The focus of this paper is to
study this correlation gap. For a particular problem
instance $(f, V, \{p_i\})$ and a decision x, we define the
correlation gap as the ratio between the expected
cost $\mathbb{E}[f(x, S)]$ under the worst-case distribution
and that under the independent distribution on S.
Correlation gap has many interesting implications 
for stochastic optimization problems. A small 
upper bound on correlation gap allows relaxation 
of the stochastic optimization problem under any 
distribution, including the worst case distribution 
model (2), to the product distribution case which
is often more efficient to solve either by sampling 
or by other algorithmic techniques [HI dj. Further, 
in many real data collection scenarios, practical 
constraints can make it very difficult (or costly) to 
learn the complete information about correlations in 
data. In those cases, the correlation gap can provide 
a guideline to decide how important it is to spend 
resources on learning these correlations. In other 
words, it measures the "value of correlations" in the 
statistical data. Our main result is to characterize 
a wide class of functions for which the correlation 
gap can be well bounded. We also provide
counterexamples showing large correlation gap for various
other classes of functions.

Below, we summarize our key results: 

• A class of functions with bounded correlation gap:
For functions f(x, S) that are non-decreasing in S and
have a cross-monotone, β-budget-balanced, (weak)
η-summable cost-sharing scheme, we show that the
correlation gap is upper bounded by $\eta\beta \frac{e}{e-1}$.
This gives correlation gap bounds (and matching
approximation factors for the robust model) of $e/(e-1)$
for submodular functions, $O(\log n)$ for facility location,
and $O(\log^2 n)$ for Steiner forest, where $n = |V|$ is
the size of the ground set.

• Hardness results: We show examples with correlation
gap of $\Omega(2^n)$ for functions supermodular
in S, $\Omega(\sqrt{n} \log\log n / \log n)$ for monotone
subadditive functions in S, and $e/(e-1)$ for submodular
functions. These examples also prove
corresponding lower bounds on the approximation
factors that can be achieved by substituting the
independent distribution for the robust model.

• Polynomial-time algorithm for supermodular
functions: We analytically characterize the
worst-case distribution when the function f(x, S)
is supermodular in S, and consequently give a
polynomial-time algorithm for the correlation-robust
model provided f is convex in x.

• New results for welfare maximization problems:
As a byproduct, our result provides a
$\frac{1}{\eta\beta}(1 - 1/e)$-approximation algorithm for the
well-studied problem of social welfare maximization
in combinatorial auctions, when the utility
functions are identical and admit an (η, β)-cost-sharing
scheme. Notably, this implies a $(1 - 1/e)$-approximation
for identical submodular utility
functions, matching the best approximation factor
(Vondrak, 2008 [15]) for this case.

We also provide a simple counterexample to
the conjecture by Bikhchandani [2] that markets
that have buyers with identical submodular
utilities admit a Walrasian price equilibrium.

The rest of the paper is organized as follows. To begin,
Section 2 provides a mathematical definition
of the correlation gap, and examples showing large
correlation gap for certain classes of cost functions. In
Section 3, we present our main technical theorem that
upper bounds the correlation gap for a wide class of
cost functions, and discuss its implications for various
stochastic optimization problems and the welfare
maximization problem. The proof of this theorem is
presented in Section 4. Finally, in Section 5, we end
with a direct solution of the correlation-robust model for
supermodular functions.

2 Correlation Gap 

For a problem instance $(f, V, \{p_i\})$ and a given
decision x, we define the correlation gap as the ratio κ
between the expected cost under the worst-case
distribution and that under the independent distribution, i.e.,

(4)    $\kappa = \frac{\mathbb{E}_{\mathcal{D}^R}[f(x, S)]}{\mathbb{E}_{\mathcal{D}^I}[f(x, S)]},$

where $\mathcal{D}^I$ is the independent Bernoulli distribution
(also called the product distribution) with marginals
$\{p_i\}$, and $\mathcal{D}^R$ is the worst-case distribution (as given
by (3)).

Suppose that for some particular cost function
f, the correlation gap is bounded above
by κ for all x. Then it is not difficult to show
that the decision obtained assuming the independent
distribution gives a κ-approximate solution to
the corresponding robust optimization problem.
More precisely, let $x_I$ be the optimal solution to the
stochastic optimization problem (1) with independent
Bernoulli distribution, and let $x_R$ be the optimal
solution to the correlation-robust problem (2). Then,
with $\mathcal{D}^R$ denoting the worst-case distribution for the
decision in question,

$g(x_I) = \mathbb{E}_{\mathcal{D}^R}[f(x_I, S)]$, and

$g(x_R) = \mathbb{E}_{\mathcal{D}^R}[f(x_R, S)] \ge \mathbb{E}_{\mathcal{D}^I}[f(x_R, S)] \ge \mathbb{E}_{\mathcal{D}^I}[f(x_I, S)].$

Using the bound on the correlation gap at $x_I$, this implies

$g(x_I) \le \kappa \, \mathbb{E}_{\mathcal{D}^I}[f(x_I, S)] \le \kappa \, g(x_R).$

Unfortunately, for general cost functions, the correlation
gap, and hence the corresponding approximation
factor, can grow rapidly with n, as demonstrated by
the following examples.

Example 1. (Minimum cost flow: $\Omega(2^n)$ correlation
gap for supermodular functions)
(Sketch) Consider a two-stage minimum cost flow
problem as in Figure 1. There is a single source
s and n sinks $t_1, \ldots, t_n$. Each sink $t_i$ has a
probability $p_i = 1/2$ of requesting a demand, in which
case a unit flow has to be sent from s to $t_i$. Each arc
$(u, t_i)$ has a fixed capacity 1, but the capacity of arc
(s, u) needs to be purchased, at a cost $c^I(x)$ in the
first stage and at a higher cost $c^{II}(x)$ in the second
stage, after the set of demand requests is revealed.

Figure 1: An example with exponential correlation gap

The costs $c^I(x)$, $c^{II}(x)$ are given as

$c^I(x) = \begin{cases} x, & x \le n - 1 \\ n + 1, & x = n \end{cases} \qquad c^{II}(x) = 2^n x.$

Given the first-stage decision x, the total cost of the
capacity bought over both stages to serve a set S of
requests is $f(x, S) = c^I(x) + c^{II}((|S| - x)^+) = c^I(x) + 2^n (|S| - x)^+$.
It is easy to check that f(x, S) is supermodular in S for any
given x, i.e., $f(x, S \cup \{i\}) - f(x, S) \ge f(x, T \cup \{i\}) - f(x, T)$
for any $S \supseteq T$. The objective is to minimize the
total expected cost $\mathbb{E}[f(x, S)]$. If the decision
maker assumes independent demands from the sinks,
then $x_I = n - 1$ minimizes the expected cost, and
the expected cost is n; however, for the worst-case
distribution the expected cost of this decision is
$g(x_I) = 2^{n-1} + n - 1$ (attained when $\Pr(V) = \Pr(\emptyset) = 1/2$
and all other scenarios have zero probability). Hence,
the correlation gap at $x_I$ is exponentially high.
A risk-averse strategy is to use the robust solution
$x_R = n$, which leads to a cost $g(x_R) = n + 1$.
Thus, the approximation ratio is $g(x_I)/g(x_R) = \Omega(2^n)$. □
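The numbers in Example 1 can be checked directly for a small n. The sketch below is illustrative only: the function names are hypothetical, and it assumes the reading $c^I(x) = x$ for $x \le n - 1$, $c^I(n) = n + 1$, and $c^{II}(x) = 2^n x$. It evaluates the expected cost under independent demands and under the two-point distribution $\Pr(V) = \Pr(\emptyset) = 1/2$ named in the example.

```python
from fractions import Fraction
from math import comb


def first_stage_cost(x: int, n: int) -> int:
    """c^I(x): x for x <= n-1, and n+1 for x = n (assumed reading of the example)."""
    return x if x <= n - 1 else n + 1


def total_cost(x: int, demand: int, n: int) -> int:
    """f(x, S) for |S| = demand: first-stage cost plus 2^n per unit of shortfall."""
    return first_stage_cost(x, n) + (2 ** n) * max(demand - x, 0)


def independent_cost(x: int, n: int) -> Fraction:
    """E[f(x, S)] when each sink requests independently with probability 1/2."""
    return sum(Fraction(comb(n, k), 2 ** n) * total_cost(x, k, n) for k in range(n + 1))


def two_point_cost(x: int, n: int) -> Fraction:
    """E[f(x, S)] when Pr(S = V) = Pr(S = empty set) = 1/2."""
    return Fraction(total_cost(x, n, n) + total_cost(x, 0, n), 2)


n = 6
x_ind = min(range(n + 1), key=lambda x: independent_cost(x, n))
print(x_ind, independent_cost(x_ind, n))  # decision chosen under independence, and its cost
print(two_point_cost(x_ind, n))           # the same decision under the two-point distribution
print(two_point_cost(n, n))               # the choice x = n under the two-point distribution
```

Exact rational arithmetic is used so that the comparison between decisions is not affected by rounding.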

Example 2. (Stochastic set cover: $\Omega(\sqrt{n} \log\log n / \log n)$
correlation gap for subadditive functions)
(Sketch) Consider a set cover problem with elements
$V = \{1, \ldots, n\}$. Each item $j \in V$ has a marginal
probability of 1/K to appear in the random set S.
The covering sets are defined as follows. Consider a
partition of V into $K = \sqrt{n}$ sets $A_1, \ldots, A_K$, each
containing K elements. The covering sets are all the
sets in the Cartesian product $A_1 \times \cdots \times A_K$. Each
set has unit cost. Then, the cost of covering a set S is

$c(S) = \max_{i = 1, \ldots, K} |S \cap A_i|, \quad \forall S \subseteq V.$

The worst-case distribution with marginal probabilities
$p_j = 1/K$ is one where $\Pr(S) = 1/K$
for $S = A_i$, $i = 1, 2, \ldots, K$, and $\Pr(S) = 0$ otherwise.
The expected value of c(S) under this distribution
is $K = \sqrt{n}$. For the independent distribution,
$c(S) = \max_{i = 1, \ldots, K} \zeta_i$, where the $\zeta_i = |S \cap A_i|$ are
independent Binomial(K, 1/K) random variables.

As K approaches $\infty$, since the expected value of $\zeta_i$
remains fixed at 1, the Binomial(K, 1/K) distribution
approaches the Poisson distribution with expected
value 1. Using known results on maxima
of independent Poisson random variables in [7],
it can be shown that for large K, the expected value
of the maximum of K i.i.d. Poisson random variables
is bounded by $\Theta(\log K / \log\log K)$ (refer to the
appendix for a detailed proof). This implies that
$\mathbb{E}[\max_{i = 1, \ldots, K} \zeta_i]$ is bounded by $\Theta(\log n / \log\log n)$
for large n. So the correlation gap is at least
$\Omega(\sqrt{n} \log\log n / \log n)$.
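The gap in Example 2 can also be evaluated exactly for moderate K, without the Poisson approximation. The following sketch (function names are hypothetical) computes $\mathbb{E}[\max_i \zeta_i]$ for K independent Binomial(K, 1/K) variables from the identity $\mathbb{E}[M] = \sum_{m \ge 1} \Pr[M \ge m]$ and compares it with the value K attained by the partition distribution.

```python
from math import comb


def binomial_cdf(m: int, trials: int, prob: float) -> float:
    """Pr[Binomial(trials, prob) <= m]."""
    return sum(comb(trials, j) * prob ** j * (1 - prob) ** (trials - j) for j in range(m + 1))


def expected_max_independent(K: int) -> float:
    """E[max_i zeta_i] for K independent Binomial(K, 1/K) variables.

    Uses E[M] = sum_{m >= 1} Pr[M >= m] with Pr[M <= m - 1] = F(m - 1)^K.
    """
    return sum(1.0 - binomial_cdf(m - 1, K, 1.0 / K) ** K for m in range(1, K + 1))


def correlation_gap(K: int) -> float:
    """Cost K under the partition distribution divided by the independent cost."""
    return K / expected_max_independent(K)


for K in (4, 16, 64):
    # columns: n = K^2, expected cost under independence, resulting gap
    print(K * K, expected_max_independent(K), correlation_gap(K))
```

The independent cost grows very slowly with K while the partition distribution always costs K, which is the behavior the asymptotic bound describes.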

To obtain an approximation lower bound for a two-stage
stochastic set cover instance, extend the
above instance as follows. For ease of notation,
let $L(n) = d \log n / \log\log n$, where d is a constant
such that $\mathbb{E}[\max_i \{\zeta_i\}] \le L(n)$. Let the first-stage
cost of a covering set be $w^I = (1 + \epsilon) L(n) / \sqrt{n}$
for some small $\epsilon > 0$, and the second-stage cost
be $w^{II} = 1$. For a given first-stage cover x,
let B(x) be the set of elements covered by x; then
$f(x, S) = w^I |x| + c(S \setminus B(x))$. Using the above analysis
for the function c(S), the optimal solution for the
independent distribution is to buy no (or very few)
sets in the first stage, giving $\mathbb{E}[f(x, S)] \le L(n)$ for the
independent distribution but $\Omega(\sqrt{n})$ cost for the
worst-case distribution. On the other hand, the optimal
robust solution, considering the worst-case distribution,
is to cover all the elements in the first stage, giving
$O(L(n))$ cost in the worst case. Thus, the approximation
ratio is $g(x_I)/g(x_R) = \Omega(\sqrt{n} \log\log n / \log n)$. □

These examples indicate that using independent 
distribution may not always give a good approxima- 






tion to the robust model. However, below we identify 
a wide class of functions for which correlations may 
be ignored to get efficient solutions for stochastic op- 
timization problems. 

3 A class of functions with low 
correlation gap 

A key contribution of our paper is to identify a class
of cost functions for which the correlation gap is
well bounded. Notably, many popular cost
functions, including submodular functions, facility
location, and Steiner forest, belong to this class,
which leads to efficient approximations for these
problems.

We derive our characterization using concepts
of cost-sharing. A cost-sharing scheme is a function
defining how to share the cost of a service among
the serviced customers. We consider the class of
cost functions f such that for every feasible x,
there exists some cost-sharing scheme for allocating
the cost f(x, S) among members of set S with (a)
β-budget balance, (b) weak cross-monotonicity, and
(c) weak η-summability. Below we precisely state
these properties. Since we assume that x can take
any fixed value, we will abbreviate f(x, S) as f(S)
for simplicity when clear from the context.

A cost-sharing scheme is cross-monotonic if it
satisfies the property that everyone is better off
when the set of people who receive the service
expands [10]. Roughgarden et al. [11] introduced an
additional property of summability for cost-sharing
schemes. Here, we define slightly weaker
versions of these properties by requiring them to
hold only for a given ordering on a subset of V.
More precisely, we define a cost-sharing scheme as
a function $\chi(i, S, \sigma_S)$ that, for each element $i \in S$
and ordering $\sigma_S$ on S, specifies the share of i in
S. The three properties of budget balance, weak
cross-monotonicity, and weak summability are now
stated as follows:



1. β-budget balance: For all S, and orderings $\sigma_S$ on S:

$f(S) \ge \sum_{i \in S} \chi(i, S, \sigma_S) \ge \frac{f(S)}{\beta}$

2. Cross-monotonicity: For all $i \in S$, $S \subseteq T$, $\sigma_S \subseteq \sigma_T$:

$\chi(i, S, \sigma_S) \ge \chi(i, T, \sigma_T)$

Here, $\sigma_S \subseteq \sigma_T$ means that the ordering $\sigma_S$ is a
restriction of the ordering $\sigma_T$ to the subset S.

3. Weak η-summability: For all S, and orderings $\sigma_S$:

$\sum_{\ell=1}^{|S|} \chi(i_\ell, S_\ell, \sigma_{S_\ell}) \le \eta f(S)$

where $i_\ell$ is the $\ell$-th element and $S_\ell$ is the set of the
first $\ell$ members of S according to the ordering $\sigma_S$,
and $\sigma_{S_\ell}$ is the restriction of $\sigma_S$ to $S_\ell$. Note
that this is a weaker requirement than the
conventional definition of summability, where a single
cost-sharing function $\chi(i, S)$ must satisfy the
given inequality for all orderings on the ground
set [n].

We re-emphasize that any cost-sharing scheme
satisfying the conventional definitions of β-budget
balance, cross-monotonicity, and η-summability (as in
[10][11]) will always satisfy the above weaker
conditions. However, this relaxation to weak conditions
can give significant savings in approximation
factors in some cases. For example, submodular
functions satisfy the above weak conditions with η = 1
and β = 1 for the incremental cost-sharing scheme:

$\chi(i_\ell, S, \sigma_S) = f(S_\ell) - f(S_{\ell-1})$

where $S_\ell$ is the set of the first $\ell$ members of S
according to the ordering $\sigma_S$. On the other hand, for
the conventional definition of summability, a lower
bound of $\eta = \Omega(\log n)$ was shown for submodular
functions in [11].
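The claim that the incremental scheme meets the three weak conditions with η = 1 and β = 1 can be checked by brute force on a small submodular function. The sketch below uses a hypothetical coverage instance (the names `COVER`, `ORDER`, `share`, and `summed_shares` are illustrative and not from the paper) and fixes one global ordering σ, taking $\sigma_S$ to be its restriction to S.

```python
from itertools import combinations

# Hypothetical coverage instance: element i of V covers the items in COVER[i].
COVER = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"c"}, 3: {"a", "d"}}
ORDER = [0, 1, 2, 3]  # a fixed ordering sigma on V; sigma_S is its restriction to S


def f(S) -> int:
    """Coverage function: number of distinct items covered by S (monotone submodular)."""
    return len(set().union(*(COVER[i] for i in S))) if S else 0


def share(i, S) -> int:
    """Incremental share of i in S under sigma_S: f(prefix up to i) - f(prefix before i)."""
    ordered = [j for j in ORDER if j in S]
    before = ordered[: ordered.index(i)]
    return f(before + [i]) - f(before)


subsets = [set(c) for r in range(len(ORDER) + 1) for c in combinations(ORDER, r)]

# Budget balance with beta = 1: the shares add up to f(S) exactly.
budget_balanced = all(sum(share(i, S) for i in S) == f(S) for S in subsets)

# Cross-monotonicity: a member's share cannot increase when the set grows.
cross_monotone = all(
    share(i, S) >= share(i, T) for S in subsets for T in subsets if S <= T for i in S
)


def summed_shares(S) -> int:
    """Left-hand side of weak summability: share of i_l computed within its own prefix S_l."""
    ordered = [j for j in ORDER if j in S]
    return sum(share(ordered[l], set(ordered[: l + 1])) for l in range(len(ordered)))


weakly_summable = all(summed_shares(S) <= f(S) for S in subsets)  # eta = 1
print(budget_balanced, cross_monotone, weakly_summable)
```

For the incremental scheme the summed shares telescope to f(S), which is why η = 1 suffices once the ordering is fixed in advance.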

Let us call a cost-sharing scheme satisfying the
above three properties an (η, β)-cost-sharing scheme.
Also, we say that a function f(x, S) is non-decreasing
in S if for every x and every $S \subseteq T$,
$f(x, S) \le f(x, T)$. Our main result is the following
theorem, which we will prove in Section 4:

Theorem 1. For any instance $(f, V, \{p_i\})$, if for all
feasible x, the cost function f(x, S) is non-decreasing
in S and has an (η, β)-cost-sharing scheme for elements
in S, then the correlation gap is bounded as

$\kappa \le \eta\beta \frac{e}{e-1}.$

As described in Section 2, this gives the following
corollary for approximating the correlation-robust model:

Corollary 1.1. For instances $(f, V, \{p_i\})$ as defined
in Theorem 1, an $\eta\beta \frac{e}{e-1}$-approximate solution for the
correlation-robust optimization problem can be
constructed by solving the corresponding stochastic
optimization problem under the independent distribution.

Further, it is easy to show that for these functions, 
the variance under independent distribution is 
bounded by 0(J^-p-), where p = minijpi}. Thus, 
if the cost function is convex in x, these stochastic 
optimization problems may be solved efficiently 
using sample average approximation (SAA) method 
[1]. For specific problems, the structural simplicity 
provided by independent distribution may even elimi- 
nate the need of using sample average approximation. 
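As an illustration of the sample average approximation step under an independent Bernoulli distribution, the sketch below draws scenarios from the product distribution and picks the candidate decision with the smallest average cost on the shared sample. Everything here is a hypothetical toy: the function names, the finite candidate set, and the cost function (buy x units in advance at 0.6 each, pay 1 per unit of shortfall) are assumptions made for the example, not the method of [1].

```python
import random


def sample_scenario(p, rng):
    """Draw S from the product distribution: element i is included with probability p[i]."""
    return {i for i, p_i in enumerate(p) if rng.random() < p_i}


def saa_minimize(f, candidates, p, num_samples, seed=0):
    """Pick the candidate decision with the smallest sample-average cost.

    The same scenarios are reused for every candidate, so decisions are
    compared on a common sample.
    """
    rng = random.Random(seed)
    scenarios = [sample_scenario(p, rng) for _ in range(num_samples)]

    def average_cost(x):
        return sum(f(x, S) for S in scenarios) / num_samples

    best = min(candidates, key=average_cost)
    return best, average_cost(best)


def toy_cost(x, S):
    """Hypothetical two-stage cost: 0.6 per unit bought in advance, 1 per unit of shortfall."""
    return 0.6 * x + max(len(S) - x, 0)


best_x, estimate = saa_minimize(toy_cost, range(5), [0.5] * 4, num_samples=20000)
print(best_x, estimate)
```

For this toy instance the exact expected costs can be computed by hand (1.575 at x = 2, against 1.6625 at x = 1), so the sample size above is far more than needed to separate the candidates.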

Before moving on to the proof of Theorem 1,
let us briefly discuss its implications for various
stochastic optimization problems, and for the
seemingly unrelated problem of welfare maximization in
combinatorial auctions:

3.1 Stochastic optimization with sub- 
modular functions 

A function $h : 2^V \to \mathbb{R}$ is submodular if
$h(S \cup \{i\}) - h(S) \le h(T \cup \{i\}) - h(T)$ for all $S \supseteq T$ and $i \in V$.
These cost functions are characterized by diminishing
marginal costs, which are common in resource
allocation problems where a resource can be shared by
multiple users, so that the marginal cost decreases as
the number of users increases. As discussed earlier, for
submodular functions η = 1, β = 1. Therefore,
Theorem 1 directly leads to the following corollary:

Corollary 1.2. If the cost function f(x, S) is non-decreasing
and submodular in S for all feasible x, then
for any instance $(f, V, \{p_i\})$, the correlation gap is
bounded by the constant $\frac{e}{e-1}$.

The next example shows the e/(e — 1) bound is 
tight for submodular functions. 

Example 3. (Tightness) Let $V := \{1, 2, \ldots, n\}$, and
define f(S) = 1 if $S \ne \emptyset$, and $f(\emptyset) = 0$. Let each
item have probability p = 1/n. Then the worst-case
distribution is $\Pr(\{i\}) = 1/n$ for each $i \in V$, with
expected value 1. The independent distribution has
an expected cost $1 - (1 - 1/n)^n \to 1 - 1/e$ as $n \to \infty$.

□ 
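The convergence in Example 3 is easy to tabulate. In the sketch below (function names are illustrative), the worst-case expected cost is 1 as stated in the example, so the correlation gap is the reciprocal of the independent expected cost.

```python
from math import e


def independent_cost(n: int) -> float:
    """E[f(S)] for f(S) = 1[S nonempty] when each of n items appears independently w.p. 1/n."""
    return 1.0 - (1.0 - 1.0 / n) ** n


def correlation_gap(n: int) -> float:
    """Worst-case expected cost (equal to 1) divided by the independent expected cost."""
    return 1.0 / independent_cost(n)


for n in (1, 2, 10, 100, 10_000):
    print(n, correlation_gap(n))
print("limit e/(e-1) =", e / (e - 1))
```

The gap increases with n and stays below $e/(e-1)$, approaching it only in the limit.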

3.2 Stochastic Uncapacitated Facility
Location (SUFL) 

In the two-stage stochastic facility location problem, any
facility $j \in F$ can be bought at a low cost $w_j^I$ in
the first stage, and at a higher cost $w_j^{II} > w_j^I$ in the
second stage, that is, after the random set $S \subseteq V$ of
cities to be served is revealed. The decision maker's
problem is to decide $x \in \{0, 1\}^{|F|}$, the facilities to be
built in the first stage, so that the total expected cost
$\mathbb{E}[f(x, S)]$ of facility location is minimized (refer to
[14] for further details on the problem definition).

Given a first-stage decision x, the cost function is
$f(x, S) = w^I \cdot x + c(x, S)$, where c(x, S) is the cost of
deterministic UFL for the set $S \subseteq V$ of customers and the set
F of facilities, such that the facilities x already bought
in the first stage are available at no cost, while any
other facility j costs $w_j^{II}$. For this deterministic UFL
cost function there exists a cross-monotonic, 3-budget-balanced,
$\log|S|$-summable cost-sharing scheme [12].
Therefore, using Theorem 1, we get the following bound
on the correlation gap:

Corollary 1.3. The correlation gap for stochastic
uncapacitated facility location is bounded by $O(\log n)$,
where $n = |V|$ is the number of cities to be served.

This observation reduces our robust facility location
problem to the well-studied stochastic UFL
problem under a known (independent Bernoulli)
distribution [14], at the expense of an $O(\log n)$
approximation factor.

3.3 Stochastic Steiner Tree (SST) 

In the two-stage stochastic Steiner tree problem, we
are given a graph G = (V, E). An edge $e \in E$ can
be bought at cost $w_e^I$ in the first stage. The random
set S of terminals to be connected is revealed in the
second stage. More edges may be bought at a higher
cost $w_e^{II}$, $e \in E$, in the second stage after observing
the actual set of terminals. Here, the decision variable
x is the set of edges to be bought in the first stage, and
the cost function is $f(x, S) = w^I \cdot x + c(x, S)$, where c(x, S)
is the Steiner tree cost function for the set S given that
the edges in x are already bought. Since a $\log^2(|S|)$-summable,
2-budget-balanced cost-sharing method is
known for this cost function |12l H|. we can conclude: 

Corollary 1.4. The correlation gap for stochastic
Steiner tree is bounded by $O(\log^2 n)$, where $n = |V|$ is
the number of terminals to be connected.

This observation reduces our robust problem to
the well-studied (for example, see [5]) SST problem
under a known (independent Bernoulli) distribution at
the expense of an $O(\log^2 n)$ approximation factor.

3.4 Welfare Maximization Problem 

Finally, Theorem 1 extends some existing results
for social welfare maximization in combinatorial
auctions. Consider the problem of maximizing the total
utility achieved by partitioning n goods among K players,
each with utility function f(S) for a subset S of goods*.
The optimal welfare OPT is obtained by the following
integer program:

(5)    $\max_{\alpha} \; \sum_{S} \alpha_S f(S)$
       s.t. $\sum_{S : i \in S} \alpha_S = 1, \quad \forall i \in V$
            $\sum_{S} \alpha_S = K$
            $\alpha_S \in \{0, 1\}, \quad \forall S \subseteq V$

* A more general formulation of this problem that is often
considered in the literature allows non-identical utility
functions for various players.

Observe that on relaxing the integrality constraints
on α and scaling it by 1/K, the above problem
reduces to that of finding the worst-case distribution $\alpha^*$
(i.e., one that maximizes the expected value $\sum_S \alpha_S f(S)$
of the function f) such that the marginal probability
$\sum_{S : i \in S} \alpha_S$ of each element is 1/K. Therefore:

$\mathrm{OPT} \le \mathbb{E}_{\alpha^*}[K f(S)]$

Consequently, the correlation gap bound in Theorem 1
leads to the following corollary for welfare
maximization problems:

Corollary 1.5. For welfare maximization problems
with n goods and K players with identical utility
functions f, the randomized algorithm that assigns each
good independently to one of the K players, each with
probability 1/K, gives a $\frac{1}{\eta\beta}(1 - \frac{1}{e})$-approximation
to the optimal partition, given that the function f is
non-decreasing and admits an (η, β)-cost-sharing scheme.

Since η = 1, β = 1 for submodular functions, the
above result matches the $1 - 1/e$ approximation factor
provided by Vondrak [15] for this problem in the case of
identical monotone submodular functions.
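The randomized assignment in Corollary 1.5 can be examined exactly on a tiny instance by enumerating all $K^n$ assignments: the best one gives OPT, and the plain average is the expected welfare of assigning each good to a uniformly random player. The instance below (a coverage utility shared by two players) and all names in it are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical instance: good g covers the items in COVER[g]; every player has the
# same coverage utility f, which is monotone submodular.
COVER = [{"a", "b"}, {"b", "c"}, {"c", "d"}, {"d", "a"}]
NUM_PLAYERS = 2


def f(goods) -> int:
    """Identical utility: number of distinct items covered by a bundle of goods."""
    return len(set().union(*(COVER[g] for g in goods))) if goods else 0


def welfare(assignment) -> int:
    """Total utility when good g is given to player assignment[g]."""
    return sum(
        f([g for g, owner in enumerate(assignment) if owner == player])
        for player in range(NUM_PLAYERS)
    )


# Enumerate all K^n assignments: the maximum is OPT, and the plain average is the
# exact expected welfare of assigning each good to a uniformly random player.
assignments = list(product(range(NUM_PLAYERS), repeat=len(COVER)))
opt = max(welfare(a) for a in assignments)
expected = sum(welfare(a) for a in assignments) / len(assignments)
print(opt, expected, expected / opt)
```

On this instance the random assignment recovers three quarters of the optimal welfare, comfortably above the $1 - 1/e$ guarantee for submodular utilities.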

The reader may observe that even though approximating
the worst-case distribution directly provides
a matching approximation for the corresponding welfare
maximization problem, the converse is not true.
In addition to having uniform probabilities $p_i = 1/K$,
solutions for welfare maximization approximate the
integer program (5), whereas the worst-case distribution
requires solving the corresponding LP relaxation.
The latter is a strictly harder problem unless
the integrality gap is 0. A notable example is the
above-mentioned case of identical submodular functions.
This case was studied by Bikhchandani [2] in the
context of Walrasian equilibria, who conjectured a
zero integrality gap for this problem, implying the
existence of Walrasian equilibria. However, in Appendix C,
we show a simple counterexample with non-zero
integrality gap (11/12) for this problem. As a
byproduct, this counterexample proves that even for
identical submodular valuation functions, Walrasian
equilibria may not exist.

4 Proof of Theorem 1

For a problem instance $(f, V, \{p_i\})$ and fixed x, use
$\mathcal{L}(f, V, \{p_i\})$ and $\mathcal{I}(f, V, \{p_i\})$ to denote the
expected cost under the worst-case distribution and the
independent Bernoulli distribution, respectively. In this
section, we prove our main technical result that the
correlation gap satisfies

$\frac{\mathcal{L}(f, V, \{p_i\})}{\mathcal{I}(f, V, \{p_i\})} \le \eta\beta \frac{e}{e-1}$

when f is non-decreasing and admits an (η, β)-cost-sharing
scheme in S. As before, we will abbreviate f(x, S)
as f(S) for simplicity.

The proof is structured as follows. We first focus
on special instances of the problem in which all $p_i$'s
are equal to 1/K for some integer K, and the worst-case
distribution is a "K-partition-type" distribution.
That is, the worst-case distribution divides the
elements of V into K disjoint sets $\{A_1, \ldots, A_K\}$, and
each $A_k$ occurs with probability 1/K. Observe that
for such instances, the expected value under the
worst-case distribution is $\mathcal{L}(f, V, \{p_i\}) = \frac{1}{K} \sum_k f(A_k)$. In
Lemma 1, we show that for such "nice" instances the
correlation gap is bounded by $\eta\beta \frac{e}{e-1}$. Then, we use
a "split" operation to reduce any given instance of
our problem to a nice instance such that the reduction
can only increase the correlation gap. This
shows that the bound $\eta\beta \frac{e}{e-1}$ for nice instances is an
upper bound for any instance of the problem, thus
concluding the proof of the theorem.

Lemma 1. For instances $(f, V, \{p_i\})$ such that (a)
f(S) is non-decreasing and admits an (η, β)-cost-sharing
scheme, (b) the marginal probabilities $p_i$ are all
equal to 1/K for some integer K, and (c) the worst-case
distribution is a K-partition-type distribution,
the correlation gap is bounded as:

$\frac{\mathcal{L}(f, V, \{1/K\})}{\mathcal{I}(f, V, \{1/K\})} \le \eta\beta \frac{e}{e-1}$

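Proof. Let the optimal K-partition corresponding to
the worst-case distribution be $\{A_1, A_2, \ldots, A_K\}$.
Assume w.l.o.g. that $f(A_1) \ge f(A_2) \ge \cdots \ge f(A_K)$.
Fix an order σ on the elements of V such that for all
k, the elements in $A_k$ come before those in $A_{k-1}$. For every
set S, let $\sigma_S$ be the restriction of the ordering σ to the
elements of S. Let χ be the (η, β)-cost-sharing
scheme for the function f, as per the assumptions of the
lemma. Then by weak η-summability of χ:

(6)    $\mathcal{I}(f, V, \{1/K\}) = \mathbb{E}_{S \subseteq V}[f(S)] \ge \frac{1}{\eta} \mathbb{E}_{S \subseteq V}\left[\sum_{\ell=1}^{|S|} \chi(i_\ell, S_\ell, \sigma_{S_\ell})\right]$

where the expected value is taken over the independent
distribution.

Denote $\phi(V) := \mathbb{E}_{S \subseteq V}\left[\sum_{\ell=1}^{|S|} \chi(i_\ell, S_\ell, \sigma_{S_\ell})\right]$. Let
p = 1/K. We will show that

$\phi(V) \ge (1 - p)\,\phi(V \setminus A_1) + \frac{p}{\beta} f(A_1).$

Recursively using this inequality will prove the result.
To prove this inequality, denote $S_{-1} = S \cap (V \setminus A_1)$ and
$S_1 = S \cap A_1$, for any $S \subseteq V$. Since the elements in $A_1$
come after the elements in $V \setminus A_1$ in the ordering $\sigma_S$, note
that for any $\ell \le |S_{-1}|$, $S_\ell \subseteq S_{-1}$, and for $\ell > |S_{-1}|$,
$i_\ell \in S_1$. Hence

(7)    $\phi(V) = \mathbb{E}_S\left[\sum_{\ell=1}^{|S_{-1}|} \chi(i_\ell, S_\ell, \sigma_{S_\ell})\right] + \mathbb{E}_S\left[\sum_{\ell=|S_{-1}|+1}^{|S|} \chi(i_\ell, S_\ell, \sigma_{S_\ell})\right]$

Since $S_\ell \subseteq S_{-1} \cup A_1$, using cross-monotonicity of χ, the
second term above can be bounded as:

(8)    $\mathbb{E}_S\left[\sum_{\ell=|S_{-1}|+1}^{|S|} \chi(i_\ell, S_\ell, \sigma_{S_\ell})\right] \ge \mathbb{E}_S\left[\sum_{\ell=|S_{-1}|+1}^{|S|} \chi(i_\ell, S_{-1} \cup A_1, \sigma_{S_{-1} \cup A_1})\right]$

Because $S_{-1}$ and $S_1$ are mutually independent, for
any fixed $S_{-1}$, each $i \in A_1$ will have the same
conditional probability p = 1/K of appearing in $S_1$.
Therefore,

(9)    $\mathbb{E}_S\left[\sum_{\ell=|S_{-1}|+1}^{|S|} \chi(i_\ell, S_{-1} \cup A_1, \sigma_{S_{-1} \cup A_1})\right]$
       $= \mathbb{E}_{S_{-1}}\left[\mathbb{E}_{S_1}\left[\sum_{\ell=|S_{-1}|+1}^{|S|} \chi(i_\ell, S_{-1} \cup A_1, \sigma_{S_{-1} \cup A_1}) \,\middle|\, S_{-1}\right]\right]$
       $= p \, \mathbb{E}_{S_{-1}}\left[\sum_{i \in A_1} \chi(i, S_{-1} \cup A_1, \sigma_{S_{-1} \cup A_1})\right]$

Again, using independence and cross-monotonicity,
analyze the first term in the right hand side of (7). This term
depends only on $S_{-1}$; writing it as a $(1 - p)$ fraction plus a
p fraction of itself and applying cross-monotonicity to the
latter gives

(10)   $\mathbb{E}_S\left[\sum_{\ell=1}^{|S_{-1}|} \chi(i_\ell, S_\ell, \sigma_{S_\ell})\right]$
       $\ge (1 - p) \, \mathbb{E}_{S_{-1}}\left[\sum_{\ell=1}^{|S_{-1}|} \chi(i_\ell, S_\ell, \sigma_{S_\ell})\right] + p \, \mathbb{E}_{S_{-1}}\left[\sum_{\ell=1}^{|S_{-1}|} \chi(i_\ell, S_{-1} \cup A_1, \sigma_{S_{-1} \cup A_1})\right]$
       $= (1 - p)\,\phi(V \setminus A_1) + p \, \mathbb{E}_{S_{-1}}\left[\sum_{\ell=1}^{|S_{-1}|} \chi(i_\ell, S_{-1} \cup A_1, \sigma_{S_{-1} \cup A_1})\right]$

Based on (7), (8), (9) and (10), and the fact that the
cost-sharing scheme χ is β-budget balanced, we deduce

(11)   $\phi(V) \ge (1 - p)\,\phi(V \setminus A_1) + p \, \mathbb{E}_{S_{-1}}\left[\sum_{\ell=1}^{|S_{-1}|} \chi(i_\ell, S_{-1} \cup A_1, \sigma_{S_{-1} \cup A_1}) + \sum_{i \in A_1} \chi(i, S_{-1} \cup A_1, \sigma_{S_{-1} \cup A_1})\right]$
       $\ge (1 - p)\,\phi(V \setminus A_1) + \frac{1}{\beta} \, p \, \mathbb{E}_{S_{-1}}[f(S_{-1} \cup A_1)]$
       $\ge (1 - p)\,\phi(V \setminus A_1) + \frac{1}{\beta} \, p \, f(A_1),$

The last inequality follows from monotonicity of f.
Expanding the above recursive inequality for $A_2, \ldots, A_K$,
we get

(12)   $\phi(V) \ge \frac{1}{\beta} \, p \sum_{k=1}^{K} (1 - p)^{k-1} f(A_k)$

Since $f(A_k)$ is decreasing in k, and p = 1/K, by simple
arithmetic (Chebyshev's sum inequality applied to the two
decreasing sequences $f(A_k)$ and $(1 - p)^{k-1}$) one can show

$\phi(V) \ge \frac{1}{\beta} \left(\sum_{k=1}^{K} p f(A_k)\right) \cdot \frac{\sum_{k=1}^{K} (1 - p)^{k-1}}{K} = \frac{1}{\beta} \left(1 - \left(1 - \frac{1}{K}\right)^K\right) \sum_{k=1}^{K} p f(A_k) \ge \frac{1}{\beta} \left(1 - \frac{1}{e}\right) \sum_{k=1}^{K} p f(A_k)$

By definition of $\phi(V)$, this gives:

$\mathcal{I}(f, V, \{1/K\}) \ge \frac{1}{\eta} \phi(V) \ge \frac{1}{\eta\beta} \left(1 - \frac{1}{e}\right) \mathcal{L}(f, V, \{1/K\}).$ □

Next, we reduce a general problem instance to an
instance satisfying the properties required in Lemma 1.
We use the following split operation.

Split: Given a problem instance $(f, V, \{p_i\})$, and
integers $\{n_i \ge 1, i \in V\}$, define a new instance
$(f', V', \{p'_j\})$ as follows: split each item $i \in V$ into $n_i$
copies $C_1^i, C_2^i, \ldots, C_{n_i}^i$, and assign a marginal
probability of $p'_{C_k^i} = p_i / n_i$ to each copy. Let $V'$ denote the
new ground set containing all the duplicates. Define
the new cost function $f' : 2^{V'} \to \mathbb{R}$ as:

(13)   $f'(S') = f(\Pi(S'))$, for all $S' \subseteq V'$,

where $\Pi(S') \subseteq V$ is the original subset of elements
whose duplicates appear in $S'$, i.e. $\Pi(S') = \{i \in V \mid C_k^i \in S' \text{ for some } k \in \{1, 2, \ldots, n_i\}\}$.

The split operation has the following properties. Their
proofs are given in Appendix B.

Property 1. If f(S) is a non-decreasing function in
S, then so is $f'$.

Property 2. If f(S) is non-decreasing in S, then
splitting does not change the worst-case expected
value, that is:

$\mathcal{L}(f, V, \{p_i\}) = \mathcal{L}(f', V', \{p'_j\})$

Property 3. If f(S) is non-decreasing in S, then
splitting can only decrease the expected value over the
independent distribution:

$\mathcal{I}(f, V, \{p_i\}) \ge \mathcal{I}(f', V', \{p'_j\}).$

The remaining proof uses these properties
of the split operation to reduce any given instance to
a "nice" instance so that Lemma 1 can be invoked
to prove the correlation gap bound.

Proof of Theorem 1. Suppose that the worst-case
distribution for the instance $(f, V, \{p_i\})$ is not a
partition-type distribution. Then, split any element
i that appears in two different sets. Simultaneously,
split the distribution by assigning probability
$\alpha'_{S'} = \alpha_{\Pi(S')}$ to each set $S'$ that contains
exactly one copy of i. Repeat until the distribution
becomes a partition. Since each new set in the
new distribution contains exactly one copy of i,
by definition of the function $f'$, this splitting does not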
change the expected function value. By Property [2] 
of Split operation, the worst case expected values 
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for the two instances (before and after splitting) 
must be the same, so this partition forms a worst 
case distribution for the new instance. Then, wc 
further spht each element (and simultaneously the 
distribution) until such that the marginal probability 
of each new element is 1/K for some large enough 
integer This reduces the worst case distribution 
to a partition , . . . , Ak such that each set Ak has 
probability \/K. Thus, the conditions (b) and (c) 
of Lemma [T] are satisfied by the reduced instance 

By Properties 2 and 3 of the split operation, the correlation gap can only become larger on splitting. So, we can focus on proving the correlation gap bound for the new instance. Now, let us consider the remaining condition (a) of Lemma 1. By Property 1, the cost function $f'$ obtained by splitting is non-decreasing. Given the original $(\eta, \beta)$ cost-sharing method $\chi$ for $f$, we show that there exists a cost-sharing method $\chi'$ for the new instance such that $\chi'$ is (1) $\beta$-budget balanced, (2) weak $\eta$-summable, and (3) cross-monotone in the following weaker sense: $\chi'$ is cross-monotone for any $S' \subseteq T'$, $\sigma_{S'} \subseteq \sigma_{T'}$ such that $\sigma_{S'}, \sigma_{T'}$ respect the partial order $A_K, \ldots, A_1$ of elements, and $S'$ is a partial prefix of $T'$, that is, for some $k \in \{1, \ldots, K\}$, $S' \subseteq A_K \cup \cdots \cup A_k$ and $T' \setminus S' \subseteq \cdots \cup A_1$. The construction of this cost-sharing scheme is given in the appendix (Lemma 3).

Thus, all the conditions in Lemma 1 are satisfied by
the new instance except for cross-monotonicity.
The weaker cross-monotonicity that the new instance
satisfies is actually sufficient to prove Lemma 1. To
see this, observe that cross-monotonicity is used only
in Equation[51and[TUl and at both of these places, the 
required prefix condition is satisfied. Thus, Lemma 1
can be invoked to bound the correlation gap for the
new instance, thereby completing the proof. □



5 Supermodular functions 

Finally, we directly consider the correlation robust model for cost functions $f(x, S)$ which are supermodular in $S$. As shown in Section 2, the correlation gap for these cost functions can be exponentially high, so the independent distribution does not give a good approximation to the worst case distribution. However, it is easy to characterize the worst case distribution and directly solve the correlation robust model in this case.

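Lemma 2. Given that function $f : 2^V \to \mathbb{R}$ is supermodular, the worst case distribution over $S$ has the following closed form:

$$\Pr(S) = \begin{cases} p_n & \text{if } S = S_n \\ p_i - p_{i+1} & \text{if } S = S_i,\ 1 \le i \le n-1 \\ 1 - p_1 & \text{if } S = \emptyset \\ 0 & \text{o.w.} \end{cases}$$

where $n = |V|$; $i$ is the $i$-th member of $V$ and $S_i$ is the set of the first $i$ members of $V$, both with respect to a specific ordering over $V$ such that $p_1 \ge \cdots \ge p_n$.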

The lemma is simple to prove; a proof appears in Appendix E. Lemma 2 implies the following corollary for solving the robust optimization problem.

Corollary 2.1. For cost functions $f(x, S)$ that are supermodular in $S$ for any feasible $x$, the robust optimization problem is simply formulated as:

$$\min_{x \in C}\; p_n f(x, S_n) + \sum_{i=1}^{n-1} (p_i - p_{i+1})\, f(x, S_i) + (1 - p_1)\, f(x, \emptyset)$$



¹ Such an integer $K$ can always be reached assuming the $p_i$'s are rational.



Thus, if $f(x, S)$ is convex in $x$ and $C$ is a convex set, then it is a convex optimization problem and can be solved efficiently.
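For a fixed $x$, the objective in Corollary 2.1 is just a weighted sum over the nested sets $S_1 \subset \cdots \subset S_n$. The following sketch evaluates it and compares it with the expectation under the independent distribution on a toy supermodular function. The function names and the example data are hypothetical, and supermodularity of the supplied function is assumed rather than checked.

```python
from itertools import product


def worst_case_expectation(f, probs):
    """Expected f(S) under the nested distribution of Lemma 2 (f assumed supermodular)."""
    order = sorted(probs, key=probs.get, reverse=True)  # p_1 >= ... >= p_n
    p = [probs[e] for e in order]
    n = len(order)
    value = (1.0 - p[0]) * f(frozenset())  # mass on the empty set
    for i in range(1, n + 1):
        # weight p_i - p_{i+1}, with the convention p_{n+1} = 0
        weight = p[i - 1] - (p[i] if i < n else 0.0)
        value += weight * f(frozenset(order[:i]))  # prefix set S_i
    return value


def independent_expectation(f, probs):
    """E[f(S)] when each element joins S independently with its marginal probability."""
    elements = list(probs)
    total = 0.0
    for mask in product([0, 1], repeat=len(elements)):
        weight = 1.0
        chosen = set()
        for e, bit in zip(elements, mask):
            weight *= probs[e] if bit else 1.0 - probs[e]
            if bit:
                chosen.add(e)
        total += weight * f(frozenset(chosen))
    return total


def squared_size(s):
    """|S|^2 is supermodular in S (example cost, not from the paper)."""
    return len(s) ** 2


probs = {"a": 0.5, "b": 0.3}
worst = worst_case_expectation(squared_size, probs)
indep = independent_expectation(squared_size, probs)
print(worst, indep)
```

In an optimization setting, `worst_case_expectation` would be evaluated for each candidate $x$; the sketch leaves the outer minimization to whatever convex solver is appropriate.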
A Maximum of Poisson Random Variables

In this section, we show that the expected value of the maximum of a set of $M$ independent identically distributed Poisson random variables can be bounded as $O(\log M / \log\log M)$ for large $M$.

Let $\lambda$ denote the mean, and $F$ denote the distribution of the i.i.d. Poisson variables $X_i$. Define $G = 1 - F$. Also define the continuous extension of $G$:

$$G_c(x) = \exp(-\lambda) \sum_{j=1}^{\infty} \frac{\lambda^{x+j}}{\Gamma(x + j + 1)}$$

Note that $G(k) = G_c(k)$ for any non-negative integer $k$. Let $\{A_k\}$ be defined by $G_c(A_k) = 1/k$. Define the continuous function $L(x) = \log(x)/\log\log(x)$. Then, in [7], it is shown that for large $k$, $A_k \sim L(k)$.

We use these asymptotic results to derive a bound on the expectation of $Z = \max_{i=1,\ldots,M} X_i$ for large $M$.

$$\begin{aligned} \mathbb{E}[Z] &= \sum_{k=0}^{\infty} \Pr(Z > k) \\ &= \sum_{k=0}^{\lceil L(M^2) \rceil} \Pr(Z > k) + \sum_{k=\lceil L(M^2) \rceil + 1}^{\infty} \Pr(Z > k) \\ &\le L(M^2) + 1 + \int_{L(M^2)}^{\infty} \Pr(Z > x)\, dx \qquad (14) \end{aligned}$$

Next, we show that the integral term on the right hand side is bounded by a constant for large $M$. Substituting $x = L(y)$ in the integral on the right hand side, we get

$$\int_{x = L(M^2)}^{\infty} \Pr(Z > x)\, dx = \int_{L(y) = L(M^2)}^{\infty} \Pr(Z > L(y))\, L'(y)\, dy \le \int_{y = M^2}^{\infty} \Pr(Z > L(y))\, \frac{1}{y}\, dy$$



Pr{Z>L{k)) 



IS a 



L'{y) denotes the derivative of funetion L{y). The 
last step follows because L'{y) < i for large enough 

y (i.e. if loglogy > 1). Further, since 
decreasing function in fc, it follows that 

-dy < E 



Pt{Z > L{y)) ^ ^ Pr(Z>L(fc)) 



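Now, for large $k$, $L(k) \sim A_k$, and

$$\Pr(Z > A_k) \le 1 - \big(1 - G_c(A_k)\big)^M = 1 - \Big(1 - \frac{1}{k}\Big)^M$$

Therefore, for large $M$,

$$\sum_{k = M^2}^{\infty} \frac{\Pr(Z > L(k))}{k} \le \sum_{k = M^2}^{\infty} \frac{1}{k}\Big(1 - \Big(1 - \frac{1}{k}\Big)^M\Big) \le 1$$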



This proves that the integral term on the right hand side of (14) is bounded by a constant, and thus, for large $M$:

$$\mathbb{E}[Z] \le L(M^2) + 2 = O(\log M / \log\log M)$$
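The bound can be sanity-checked numerically by evaluating $\mathbb{E}[Z] = \sum_{k \ge 0} \Pr(Z > k)$ directly, using $\Pr(Z > k) = 1 - (1 - G(k))^M$. The sketch below is only an illustration for a moderate mean: the truncation point `k_max` and the helper names are assumptions of this example, and the asymptotic statement itself concerns large $M$.

```python
import math


def L(x):
    """L(x) = log(x) / log(log(x)), as defined above."""
    return math.log(x) / math.log(math.log(x))


def expected_max_poisson(lam, m, k_max=200):
    """E[max of m i.i.d. Poisson(lam)] via sum_k Pr(Z > k), truncated at k_max (assumes small lam)."""
    pmf = [math.exp(-lam)]
    for j in range(1, k_max + 1):
        pmf.append(pmf[-1] * lam / j)
    expectation = 0.0
    tail = 0.0
    # Accumulate the survival function G(k) = Pr(X > k) from the far tail for accuracy.
    for k in range(k_max - 1, -1, -1):
        tail += pmf[k + 1]
        if tail >= 1.0:
            expectation += 1.0
        else:
            # Pr(Z > k) = 1 - (1 - G(k))^m, computed stably for tiny G(k)
            expectation += -math.expm1(m * math.log1p(-tail))
    return expectation


M = 1000
e_max = expected_max_poisson(1.0, M)
bound = L(M ** 2) + 2
print(e_max, bound)
```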



B Properties of Split Operation

Property 1. If $f(S)$ is non-decreasing in $S$ with an $(\eta, \beta)$-cost sharing scheme, then so is $f'$.

Proof. Monotonicity holds since for any $S' \subseteq T' \subseteq V'$, $\Pi(S') \subseteq \Pi(T')$:

$$f'(S') = f(\Pi(S')) \le f(\Pi(T')) = f'(T')$$

□



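Property 2. If the cost function $f(\cdot)$ is non-decreasing in $S$, then the splitting procedure does not change the worst-case expected value. That is:

$$\mathcal{C}(f, V, \{p_i\}) = \mathcal{C}(f', V', \{p'_j\})$$

Proof. For any fixed $x$, the worst case expected cost is the optimal value of the following linear program, where $\{\alpha_S\}_{S \subseteq V}$ represents a distribution over subsets of set $V$:

$$\begin{array}{rll} \mathcal{C}(f, V, \{p_i\}) = \max_{\alpha} & \sum_{S} \alpha_S f(x, S) & \\ \text{s.t.} & \sum_{S : i \in S} \alpha_S = p_i, & \forall i \in V \\ & \sum_{S} \alpha_S = 1 & \\ & \alpha_S \ge 0, & \forall S \subseteq V. \end{array} \qquad (15)$$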

Suppose item 1 is split into $n_1$ pieces, and each piece is assigned a probability $p_1/n_1$. Let $\{\alpha_S\}$ denote the optimal solution for the instance $(f, V, \{p_i\})$; then we can construct a solution for the new instance $(f', V', \{p'_j\})$ which has the same objective value by assigning non-zero probabilities to only those sets which have no duplicates:

$$\forall S' \subseteq V', \quad \alpha'_{S'} = \begin{cases} \alpha_{\Pi(S')} & \text{if } S' \text{ contains no copies of item 1} \\ \frac{1}{n_1}\, \alpha_{\Pi(S')} & \text{if } S' \text{ contains exactly one copy} \\ 0 & \text{otherwise} \end{cases}$$

One can verify that $\{\alpha'_{S'}\}$ is a feasible distribution (i.e., feasible to the linear program (15)) for the new instance $(f', V', \{p'_j\})$, and has the same objective value as $\mathcal{C}(f, V, \{p_i\})$. Hence, $\mathcal{C}(f, V, \{p_i\}) \le \mathcal{C}(f', V', \{p'_j\})$.

For the other direction, consider an optimal solution $\{\alpha'_{S'}\}$ of the new instance. It is easy to see that there exists an optimal solution $\{\alpha'_{S'}\}$ such that $\alpha'_{S'} = 0$ for all $S'$ that contain more than one copy of item 1. To see this, assume for contradiction that some set with non-zero probability has two copies of item 1. By definition of $f'$, removing one copy will not decrease the function value. Then, because of monotonicity of $f'$, we can move out one copy to another set $T$ that has no copy of item 1. Such $T$ always exists since the probabilities of copies of item 1 must sum up to $p_1 \le 1$. So, we can assume that in the optimal solution $\alpha'_{S'} = 0$ for any set $S'$ containing more than one copy. Thus, we can set $\alpha_S = \alpha'_{S'}$, where $S$ is the corresponding original set for any $S' \subseteq V'$. That forms a feasible solution for the original instance with the same objective value as $\mathcal{C}(f', V', \{p'_j\})$. We can apply the argument recursively for all the items to prove the lemma. □

Next, we prove that the expected cost under the independent Bernoulli distribution can only decrease by the split operation.

Property 3. If $f(\cdot)$ is non-decreasing, then after splitting

$$\mathcal{I}(f', V', \{p'_j\}) \le \mathcal{I}(f, V, \{p_i\}).$$

Proof. Let $(f', V', \{p'_j\})$ denote the new instance obtained by splitting item 1 into $n_1$ pieces. Denote

$$\mathcal{A} := \{S' \subseteq V' \mid S' \text{ contains at least one copy of } 1\},$$

and denote $\pi = \Pr(S' \in \mathcal{A})$. Consider the expected cost under the independent Bernoulli distribution; by independence,

$$\begin{aligned} \mathcal{I}(f', V', \{p'_j\}) &= \mathbb{E}_{S'}\big[f'(S')\, I(S' \in \mathcal{A})\big] + \mathbb{E}_{S'}\big[f'(S')\, I(S' \notin \mathcal{A})\big] \\ &= \pi\, \mathbb{E}_{S \subseteq V \setminus \{1\}}\big[f(S \cup \{1\})\big] + (1 - \pi)\, \mathbb{E}_{S \subseteq V \setminus \{1\}}\big[f(S)\big] \\ &\le p_1\, \mathbb{E}_{S \subseteq V \setminus \{1\}}\big[f(S \cup \{1\})\big] + (1 - p_1)\, \mathbb{E}_{S \subseteq V \setminus \{1\}}\big[f(S)\big] \\ &= \mathcal{I}(f, V, \{p_i\}). \end{aligned}$$

The second last inequality holds because $\pi = 1 - (1 - p_1/n_1)^{n_1} \le p_1$, and $f(S) \le f(S \cup \{1\})$ by monotonicity. □

C Integrality gap for SWM with identical submodular valuations

Let $V = \{1, 2, 3, 4, 5, 6\}$, $K = 3$, and construct a monotone submodular value function as

$$f(S) = \begin{cases} 0 & \text{if } S = \emptyset \\ 2 & \text{if } |S| = 1 \\ 3 & \text{if } |S \cap \{1,2,3\}| = 1 \text{ and } |S \cap \{4,5,6\}| = 1 \\ 4 & \text{if } |S \cap \{1,2,3\}| \ge 2 \text{ or } |S \cap \{4,5,6\}| \ge 2 \end{cases}$$

Then the optimal fractional solution to the LP relaxation of (5) is given by

$$\alpha_{\{1,2\}} = \alpha_{\{2,3\}} = \alpha_{\{1,3\}} = 0.5, \quad \alpha_{\{4,5\}} = \alpha_{\{5,6\}} = \alpha_{\{4,6\}} = 0.5$$

with an optimal value 12; but the optimal integer solution will have an optimal value 11. So there is an 11/12 integrality gap.


D Construction of cost-sharing 
scheme 

Lemma 3. Given an $(\eta, \beta)$ cost-sharing scheme $\chi$ for $(f, V, \{p_i\})$, there exists a cost-sharing scheme $\chi'$ for the instance $(f', V', \{p'_j\})$ constructed by splitting such that $\chi'$ is (a) $\beta$-budget balanced, (b) weak $\eta$-summable, and (c) cross-monotone for any $S' \subseteq T'$, $\sigma_{S'} \subseteq \sigma_{T'}$ such that $S'$ is a partial prefix of $T'$.

Proof. Given cost-sharing scheme $\chi$, construct $\chi'$ as follows: $\chi'$ coincides with the original scheme $\chi$ for the sets without duplicates, but for a set with duplicates, it assigns the cost-share solely to the copy with the smallest index (as per the input ordering). That is, for any $S' \subseteq V'$, ordering $\sigma'_{S'}$, and item $C^i_j$ ($j$-th copy of item $i$) in $S'$, allocate cost-shares as follows:

$$\chi'(C^i_j, S', \sigma'_{S'}) = \begin{cases} \chi(i, S, \sigma_S) & \text{if } j = \min\{h : C^i_h \in S'\} \\ 0 & \text{o.w.} \end{cases} \qquad (16)$$

where $S = \Pi(S')$, $\sigma_S$ is the ordering of lowest index copies in $\sigma'_{S'}$, and the min is taken with respect to the ordering $\sigma'_{S'}$. It is easy to see that the property of $\beta$-budget-balance carries through to the new cost sharing scheme. Weak $\eta$-summability holds since

$$\sum_{\ell=1}^{|S'|} \chi'(i'_\ell, S'_\ell, \sigma'_{S'_\ell}) = \sum_{\ell=1}^{|S|} \chi(i_\ell, S_\ell, \sigma_{S_\ell}) \le \eta f(S) = \eta f'(S')$$

where $S = \Pi(S')$, and $\sigma_S$ is the ordering of lowest index copies in $\sigma'_{S'}$.

For cross-monotonicity, consider $S' \subseteq T'$, $\sigma_{S'} \subseteq \sigma_{T'}$ such that $S'$ is a "partial prefix" of $T'$. Now, for any $i' \in S'$, if $i'$ is not a lowest indexed copy in $T'$, then $\chi'(i', T', \sigma'_{T'}) = 0$, so the condition is automatically satisfied. If $i'$ is one of the lowest indexed copies in $T'$, then it must have been a lowest indexed copy in $S'$, since $S'$ is a subset of $T'$ and $\sigma_{S'} \subseteq \sigma_{T'}$. Thus,

$$\chi'(i', T', \sigma'_{T'}) = \chi(i, T, \sigma_T) \le \chi(i, S, \sigma_S) = \chi'(i', S', \sigma'_{S'})$$

where $S = \Pi(S')$, $T = \Pi(T')$, and $\sigma_S, \sigma_T$ are the orderings of lowest indexed copies in $S', T'$ respectively. Note that the inequality above uses cross-monotonicity of $\chi$, which is satisfied only if, in addition to $S \subseteq T$, we have $\sigma_S \subseteq \sigma_T$, that is, if the ordering of elements of $S$ is the same in $\sigma_S$ and $\sigma_T$. We show that this is true given the assumption that $\sigma_{S'}, \sigma_{T'}$ respect the partial ordering $A_K, \ldots, A_1$, and $S'$ is a "partial prefix" of $T'$, that is, $S' \subseteq A_K \cup \cdots \cup A_k$ and $T' \setminus S' \subseteq \cdots \cup A_1$ for some $k$. To see this, observe that the splitting was performed in a manner so that at most one copy of any element appears in each $A_k$. So, among the newly added copies $T' \setminus S'$, any copy of an element of $S$ can occur only in $T' \cap A_{k+1}$ or later. Since $S' \subseteq A_1 \cup \cdots \cup A_k$, this means that for any element $i \in S$, the newly added copies occur only later in the ordering and they cannot alter the order of lowest indexed copies of elements of $S$. This proves that $\sigma_S \subseteq \sigma_T$. □


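E Proof of Lemma 2

For any fixed $x$, the worst case expected cost is the optimal value of the following linear program, where $\{\alpha_S\}_{S \subseteq V}$ represents a distribution over subsets of set $V$:

$$\begin{array}{rll} \mathcal{C}(f, V, \{p_i\}) = \max_{\alpha} & \sum_{S} \alpha_S f(x, S) & \\ \text{s.t.} & \sum_{S : i \in S} \alpha_S = p_i, & \forall i \in V \\ & \sum_{S} \alpha_S = 1 & \\ & \alpha_S \ge 0, & \forall S \subseteq V. \end{array} \qquad (17)$$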

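It is easy to verify that

$$\alpha_S = \begin{cases} p_n & \text{if } S = S_n \\ p_i - p_{i+1} & \text{if } S = S_i,\ 1 \le i \le n - 1 \\ 1 - p_1 & \text{if } S = \emptyset \\ 0 & \text{o.w.} \end{cases}$$

is a feasible solution to (17). Next we show that it is actually the optimal solution. The dual of linear program (17) is:

$$\begin{array}{rl} \min_{\lambda, \gamma} & \gamma + p^{T}\lambda \\ \text{s.t.} & f(S) - \sum_{i \in S} \lambda_i \le \gamma, \quad \forall S. \end{array} \qquad (18)$$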



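Consider the problem in $\lambda$ for a given value of $\gamma$. This problem is to minimize a linear function with positive coefficients ($p_i$) over the supermodular polyhedron (of the supermodular function $f(S) - \gamma$). Minimizing a linear function over a supermodular polyhedron can be solved by a greedy procedure [5], with the optimal value given by $\sum_{i=1}^{n} p_i \big(f(S_i) - f(S_{i-1})\big)$, where $f(S_0)$ is replaced by $\gamma$ because the polyhedron is defined by $f(S) - \gamma$. Then (18) can be rewritten as

$$\begin{array}{rl} \min_{\gamma} & \gamma + p_n f(S_n) + \sum_{i=1}^{n-1} (p_i - p_{i+1})\, f(S_i) - p_1 \gamma \\ \text{s.t.} & f(\emptyset) \le \gamma. \end{array}$$

The optimal solution for the above is $\gamma = f(\emptyset)$, therefore the optimal value is

$$p_n f(S_n) + \sum_{i=1}^{n-1} (p_i - p_{i+1})\, f(S_i) + (1 - p_1)\, f(\emptyset),$$

which equals the objective value of the feasible solution given earlier in this proof. This proves the lemma.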
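The duality argument can be replayed on a small instance: the nested distribution gives a primal objective, the greedy choice $\gamma = f(\emptyset)$, $\lambda_i = f(S_i) - f(S_{i-1})$ gives a dual solution, and matching objectives together with dual feasibility certify optimality. The following sketch checks this by enumeration; the function names and the example data are hypothetical, and the elements are assumed to be listed in non-increasing order of their probabilities.

```python
from itertools import combinations


def primal_value(f, order, p):
    """Objective of the nested distribution of Lemma 2 (p must be non-increasing along order)."""
    n = len(order)
    value = (1.0 - p[0]) * f(frozenset())
    for i in range(1, n + 1):
        weight = p[i - 1] - (p[i] if i < n else 0.0)  # p_i - p_{i+1}, with p_{n+1} = 0
        value += weight * f(frozenset(order[:i]))
    return value


def greedy_dual(f, order, p):
    """Greedy dual solution: gamma = f(empty set), lambda_i = f(S_i) - f(S_{i-1})."""
    gamma = f(frozenset())
    lam = {}
    for i, e in enumerate(order, start=1):
        lam[e] = f(frozenset(order[:i])) - f(frozenset(order[:i - 1]))
    objective = gamma + sum(p_i * lam[e] for e, p_i in zip(order, p))
    return gamma, lam, objective


def dual_feasible(f, order, gamma, lam, tol=1e-9):
    """Check f(S) - sum_{i in S} lambda_i <= gamma for every subset S."""
    for r in range(len(order) + 1):
        for subset in combinations(order, r):
            if f(frozenset(subset)) - sum(lam[e] for e in subset) > gamma + tol:
                return False
    return True


def squared_size(s):
    """|S|^2 is supermodular (example cost, not from the paper)."""
    return len(s) ** 2


order, p = ["a", "b", "c"], [0.7, 0.4, 0.2]
primal = primal_value(squared_size, order, p)
gamma, lam, dual = greedy_dual(squared_size, order, p)
feasible = dual_feasible(squared_size, order, gamma, lam)
print(primal, dual, feasible)
```

Subset enumeration is exponential in $|V|$, so this check is meant only for toy instances.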